Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EventStreams: Customisable Terminating Byte Sequence #115

Merged
merged 26 commits into from
Oct 3, 2024

Conversation

paulhdk
Copy link
Contributor

@paulhdk paulhdk commented Sep 17, 2024

Motivation

As discussed in apple/swift-openapi-generator#622, some APIs, e.g., ChatGPT or Claude, may return a non-JSON byte sequence to terminate a stream of events.
If not handled with a workaround (see below)such non-JSON terminating byte sequences cause a decoding error.

Modifications

This PR adds the ability to customise the terminating byte sequence by providing a closure to asDecodedServerSentEvents() as well as asDecodedServerSentEventsWithJSONData() that can match incoming data for the terminating byte sequence before it is decoded into JSON, for instance.

Result

Instead of having to decode and re-encode incoming events to filter out the terminating byte sequence - as seen in apple/swift-openapi-generator#622 (comment) - terminating byte sequences can now be cleanly caught by either providing a closure or providing the terminating byte sequence directly when calling asDecodedServerSentEvents() and asDecodedServerSentEventsWithJSONData().

Test Plan

This PR includes unit tests that test the new function parameters as part of the existing tests for asDecodedServerSentEvents() as well as asDecodedServerSentEventsWithJSONData().

@paulhdk paulhdk changed the title Customisable terminating byte sequence EventStreams: Customisable Terminating Byte Sequence Sep 17, 2024
Copy link
Collaborator

@simonjbeaumont simonjbeaumont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a courtesy note to acknowledge this PR—thanks for getting stuck in!

I’m out sick so might not get to this immediately. A quick glance: you’ll need to not remove the existing public initialiser (note that adding a parameter to a public method is technically API breaking because you can refer to functions by their signature.

@czechboy0
Copy link
Collaborator

Thanks @paulhdk, the general shape of this is correct, let's just polish it up a bit together.

  1. As Si mentioned, changing (even adding) function parameters is an API-breaking change, so we need to add the old variants to Deprecated.swift and mark it as deprecated (you'll see the many other deprecated methods in that name, just follow that pattern). That'll fix the first piece of feedback.
  2. While the method you used to terminate the sequence should work, I think there's a simpler way to do this: AsyncSequence contains the prefix(while:) method, which is basically exactly what we need - we want the sequence to be iterated until a specific element is encountered, at which point we want to stop. That's equivalent to the prefix(while:) method, just with the test closure inverted (stop when the element matches a pattern). That means that you don't need to change the two implementation-only async sequences at all, you'll just need to take the input sequence and call the prefix(while:) method on it, and provide that. Let me try to add a suggestion on the diff to illustrate this better.
  3. The naming of the parameter is interesting. terminate: is fine, but I think we can make it even clearer. terminator is used by print, but in that case, it adds the terminator to the rendered string, whereas here the parameter dictates when to stop iterating the elements. It might still work as a parameter name for the variant that provides ArraySlice<UInt8>, and terminateOn: for the (ArraySlice<UInt8>) -> Bool variant, but I'm happy for us to bikeshed that here a bit more, I hope we can make the name even better.

Copy link
Collaborator

@czechboy0 czechboy0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above

@paulhdk
Copy link
Contributor Author

paulhdk commented Sep 17, 2024

Just a courtesy note to acknowledge this PR—thanks for getting stuck in!

Thank you both for all your help along the way!

I’m out sick so might not get to this immediately. A quick glance: you’ll need to not remove the existing public initialiser (note that adding a parameter to a public method is technically API breaking because you can refer to functions by their signature.

Thanks @paulhdk, the general shape of this is correct, let's just polish it up a bit together.

  1. As Si mentioned, changing (even adding) function parameters is an API-breaking change, so we need to add the old variants to Deprecated.swift and mark it as deprecated (you'll see the many other deprecated methods in that name, just follow that pattern). That'll fix the first piece of feedback.

Ohh, I wasn't aware. But hadn't we discussed as an initial requirement that this PR shouldn't change the existing API? I thought about providing the changes of this PR in separate functions but I wasn't sure how to avoid duplicate code in that case. Is there an elegant way to preserve the existing API with this PR's changes?

  1. While the method you used to terminate the sequence should work, I think there's a simpler way to do this: AsyncSequence contains the prefix(while:) method, which is basically exactly what we need - we want the sequence to be iterated until a specific element is encountered, at which point we want to stop. That's equivalent to the prefix(while:) method, just with the test closure inverted (stop when the element matches a pattern). That means that you don't need to change the two implementation-only async sequences at all, you'll just need to take the input sequence and call the prefix(while:) method on it, and provide that. Let me try to add a suggestion on the diff to illustrate this better.

Not sure if I'm missing something, but I'm not seeing your comment in the code? I just see "See comment above" but is not attached to any line of code.

  1. The naming of the parameter is interesting. terminate: is fine, but I think we can make it even clearer. terminator is used by print, but in that case, it adds the terminator to the rendered string, whereas here the parameter dictates when to stop iterating the elements. It might still work as a parameter name for the variant that provides ArraySlice<UInt8>, and terminateOn: for the (ArraySlice<UInt8>) -> Bool variant, but I'm happy for us to bikeshed that here a bit more, I hope we can make the name even better.

Hm. What about processUntil: {}? In that case we would have to invert the return value of the closure though.
Or maybe terminatingByteReceived: {}?

@czechboy0
Copy link
Collaborator

Ohh, I wasn't aware. But hadn't we discussed as an initial requirement that this PR shouldn't change the existing API? I thought about providing the changes of this PR in separate functions but I wasn't sure how to avoid duplicate code in that case. Is there an elegant way to preserve the existing API with this PR's changes?

The usual steps we take when replacing an API while preserving it in a deprecated form:

  1. Move the existing function into Deprecated.swift
  2. Mark it as deprecated
  3. Copy it into the original location and make the modification (you've already done this)

We can usually manage to have the deprecated variant call into the new variant. In this case, I believe we should be able to make the deprecated variant call the new variant with a { false } closure for the terminator, meaning it'll never terminate prematurely and will always run through the full sequence.

Not sure if I'm missing something, but I'm not seeing your comment in the code? I just see "See comment above" but is not attached to any line of code.

Apologies, I got distracted and forgot to add the comment 🤦‍♂️ Let me do that now.

Hm. What about processUntil: {}? In that case we would have to invert the return value of the closure though.

Yeah I expect us to need to invert the value if we use prefix(while:) under the hood, but that's okay I think.

Since it's a closure with a boolean result, maybe it could be called isLastElement: (ArraySlice<UInt8>) -> Bool = { false }? Or shouldStop? I'll let @simonjbeaumont help us with the bikeshedding here 🙂

@simonjbeaumont simonjbeaumont added the semver/minor Adds new public API. label Sep 18, 2024
@paulhdk paulhdk force-pushed the customisable-terminating-byte-sequence branch 2 times, most recently from 9957e84 to 1487eda Compare September 20, 2024 18:02
@paulhdk
Copy link
Contributor Author

paulhdk commented Sep 30, 2024

@czechboy0 @simonjbeaumont just giving this a polite bump 🙂

I believe I’ve addressed all of your feedback so far. I just need clarification on one point above, @czechboy0.

Let me know if there’s anything missing.

Copy link
Collaborator

@czechboy0 czechboy0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the ping, @paulhdk - I took a more detailed look and suggested a few more changes, mainly around the details of the sequence's state machine.

@paulhdk paulhdk force-pushed the customisable-terminating-byte-sequence branch from a2b9089 to f14b5e8 Compare October 1, 2024 16:32
@paulhdk
Copy link
Contributor Author

paulhdk commented Oct 1, 2024

Thanks for the ping, @paulhdk - I took a more detailed look and suggested a few more changes, mainly around the details of the sequence's state machine.

Thanks for the review, @czechboy0! I’ve addressed everything and ran into some questions. See above.

Copy link
Collaborator

@czechboy0 czechboy0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two more minor comments, otherwise lgtm

@simonjbeaumont
Copy link
Collaborator

Just a courtesy note to all authors of open PRs. We have just cut over to GitHub Actions for CI so you will see different checks coming past. There is also a note in the CONTRIBUTING.md on how to run these locally: https://github.com/apple/swift-openapi-runtime/blob/main/CONTRIBUTING.md#run-ci-checks-locally.

@paulhdk paulhdk force-pushed the customisable-terminating-byte-sequence branch from a770b5f to 0d4345e Compare October 3, 2024 04:35
Copy link
Collaborator

@czechboy0 czechboy0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, added a few more nits, but this is now ready for CI - kicking off. Great work!

Copy link
Collaborator

@czechboy0 czechboy0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks!

@czechboy0
Copy link
Collaborator

czechboy0 commented Oct 3, 2024

You'll still need to rerun the formatter - see the updated CONTRIBUTING.md on how to use the act tool to run the format action locally, and commit the changes

@paulhdk
Copy link
Contributor Author

paulhdk commented Oct 3, 2024

You'll still need to rerun the formatter - see the updated CONTRIBUTING.md on how to use the act tool to run the format action locally, and commit the changes

PTAL. Ran the formatter locally. All checks should pass now!

@czechboy0 czechboy0 merged commit d604dd0 into apple:main Oct 3, 2024
18 checks passed
@czechboy0
Copy link
Collaborator

Thanks again for your contribution and for your patience as we've worked through all the details 👏 🙏

@PSchmiedmayer
Copy link

Congratulations @paulhdk; exciting to see this move into SpeziLLM and the work you have been doing there! 🚀

Thank you @czechboy0 & @simonjbeaumont for the support, reviews, and input!

@paulhdk
Copy link
Contributor Author

paulhdk commented Oct 3, 2024

Thanks again for your contribution and for your patience as we've worked through all the details 👏 🙏

Thank you @czechboy0 and @simonjbeaumont for your help! It was fun working this one out and I learnt lots!

@czechboy0
Copy link
Collaborator

It was fun working this one out and I learnt lots!

Glad to hear it! If you're ever looking to contribute more, browse through https://github.com/apple/swift-openapi-generator/issues and comment on anything that catches your eye, and we can work together again 🙂

@paulhdk paulhdk deleted the customisable-terminating-byte-sequence branch October 3, 2024 17:38
czechboy0 added a commit to apple/swift-openapi-generator that referenced this pull request Oct 3, 2024
### Motivation

See: apple/swift-openapi-runtime#115

### Modifications

This PR references the changes introduced by TODO in the documentation.

### Result

Users planning to generate code for OpenAPI specs with custom
terminating byte sequences will hopefully find it easier to make the
necessary changes in their code.

Please let me know if there are other places where these changes could
be mentioned.

### Test Plan
.

---------

Co-authored-by: Honza Dvorsky <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver/minor Adds new public API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants